Overview

Dataset statistics

Number of variables: 10
Number of observations: 442
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 34.7 KiB
Average record size in memory: 80.3 B

Variable types

NUM: 9
CAT: 1

Reproduction

Analysis started: 2020-06-04 15:34:28.237059
Analysis finished: 2020-06-04 15:34:49.813021
Version: pandas-profiling v2.6.0
Command line: pandas_profiling --config_file config.yaml [YOUR_FILE.csv]
Download configuration: config.yaml

Variables

age
Real number (ℝ)

Distinct count: 58
Unique (%): 13.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: -3.634284929e-16
Minimum: -0.1072256316
Maximum: 0.1107266755
Zeros: 0
Zeros (%): 0.0%
Memory size: 3.6 KiB

Quantile statistics

Minimum: -0.1072256316
5-th percentile: -0.0854304009
Q1: -0.03729926643
Median: 0.005383060374
Q3: 0.03807590643
95-th percentile: 0.07076875249
Maximum: 0.1107266755
Range: 0.2179523071
Interquartile range (IQR): 0.07537517286

Descriptive statistics

Standard deviation: 0.04761904762
Coefficient of variation (CV): -1.31027282e+14
Kurtosis: -0.6712236886
Mean: -3.634284929e-16
Median Absolute Deviation (MAD): 0.03929479746
Skewness: -0.231381533
Sum: -1.608713163e-13
Variance: 0.002267573696
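
Several of the figures above are derived from others in the same block. The following minimal sketch, which uses only the numbers printed for age in this report, recomputes the range, IQR, variance and CV. It assumes the CV is the standard deviation divided by the mean, which is consistent with the reported values. Because the mean is zero up to floating-point rounding, the resulting CV is enormous and carries little information for this variable.

```python
# Values copied from the "age" block of this report.
mean = -3.634284929e-16
std = 0.04761904762
minimum, maximum = -0.1072256316, 0.1107266755
q1, q3 = -0.03729926643, 0.03807590643

value_range = maximum - minimum  # compare with the reported Range
iqr = q3 - q1                    # compare with the reported IQR
variance = std ** 2              # compare with the reported Variance
cv = std / mean                  # assumption: CV = standard deviation / mean

print(f"range    = {value_range:.10f}")
print(f"iqr      = {iqr:.11f}")
print(f"variance = {variance:.12f}")
print(f"cv       = {cv:.9e}")
```

The same check applies to the other numeric variables, whose means are also of the order of 1e-16.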
Histogram with fixed size bins (bins=10)
Histogram with variable size bins (bins=[-0.10722563 -0.0618189 -0.01096336 0.05078979 0.07621756 0.11072668], "bayesian blocks" binning strategy used)
Value Count Frequency (%)
0.01628067573 19 4.3%
0.04170844488 17 3.8%
0.009015598825 16 3.6%
-0.02730978568 15 3.4%
0.04534098334 14 3.2%
0.01264813728 14 3.2%
-0.05273755484 14 3.2%
-0.001882016528 14 3.2%
0.005383060374 13 2.9%
0.06713621404 13 2.9%
Other values (48) 293 66.3%

Value Count Frequency (%)
-0.1072256316 3 0.7%
-0.1035930932 3 0.7%
-0.09996055471 2 0.5%
-0.09632801625 4 0.9%
-0.0926954778 4 0.9%

Value Count Frequency (%)
0.1107266755 2 0.5%
0.09619652165 2 0.5%
0.0925639832 1 0.2%
0.08893144475 1 0.2%
0.0852989063 1 0.2%

sex
Categorical

Distinct count: 2
Unique (%): 0.5%
Missing: 0
Missing (%): 0.0%
Memory size: 3.6 KiB

Value Count Frequency (%)
-0.04464163651 235 53.2%
0.05068011874 207 46.8%

Length

Max length: 18
Mean length: 18
Min length: 18

Value Count Frequency (%)
Decimal_Number 9 81.8%
Dash_Punctuation 1 9.1%
Other_Punctuation 1 9.1%

Value Count Frequency (%)
Common 11 100.0%

Value Count Frequency (%)
ASCII 11 100.0%

bmi
Real number (ℝ)

Distinct count: 163
Unique (%): 36.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: -8.045349203e-16
Minimum: -0.0902752959
Maximum: 0.170555226
Zeros: 0
Zeros (%): 0.0%
Memory size: 3.6 KiB

Quantile statistics

Minimum: -0.0902752959
5-th percentile: -0.06656343027
Q1: -0.03422906806
Median: -0.00728376621
Q3: 0.03124801543
95-th percentile: 0.08540807214
Maximum: 0.170555226
Range: 0.2608305219
Interquartile range (IQR): 0.06547708349

Descriptive statistics

Standard deviation: 0.04761904762
Coefficient of variation (CV): -5.918829179e+13
Kurtosis: 0.09509447428
Mean: -8.045349203e-16
Median Absolute Deviation (MAD): 0.03835843377
Skewness: 0.5981484879
Sum: -3.54216656e-13
Variance: 0.002267573696
Histogram with fixed size bins (bins=10)
Histogram with variable size bins (bins=[-0.0902753 -0.07033577 -0.04231266 0.0078056 0.07193542 0.09780291 0.17055523], "bayesian blocks" binning strategy used)
Value Count Frequency (%)
-0.02452875939 8 1.8%
-0.03099563184 8 1.8%
-0.04608500087 7 1.6%
-0.008361578284 7 1.6%
-0.02560657147 7 1.6%
0.01427247527 6 1.4%
-0.03315125598 6 1.4%
-0.02345094732 6 1.4%
0.001338730381 6 1.4%
-0.0202175111 6 1.4%
Other values (153) 375 84.8%

Value Count Frequency (%)
-0.0902752959 1 0.2%
-0.08919748382 1 0.2%
-0.08488623553 1 0.2%
-0.08380842346 1 0.2%
-0.08165279931 2 0.5%

Value Count Frequency (%)
0.170555226 1 0.2%
0.1608549173 1 0.2%
0.1371430517 1 0.2%
0.1285205551 1 0.2%
0.127442743 1 0.2%

bp
Real number (ℝ)

Distinct count: 100
Unique (%): 22.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.281654521e-16
Minimum: -0.1123996021
Maximum: 0.1320442172
Zeros: 0
Zeros (%): 0.0%
Memory size: 3.6 KiB

Quantile statistics

Minimum: -0.1123996021
5-th percentile: -0.07435588089
Q1: -0.0366564468
Median: -0.005670610555
Q3: 0.03564383777
95-th percentile: 0.08367188395
Maximum: 0.1320442172
Range: 0.2444438193
Interquartile range (IQR): 0.07230028457

Descriptive statistics

Standard deviation: 0.04761904762
Coefficient of variation (CV): 3.715435543e+14
Kurtosis: -0.5327797228
Mean: 1.281654521e-16
Median Absolute Deviation (MAD): 0.03928220463
Skewness: 0.2906638512
Sum: 5.700995231e-14
Variance: 0.002267573696
Histogram with fixed size bins (bins=10)
Histogram with variable size bins (bins=[-0.1123996 -0.07624946 -0.04067313 -0.03837788 -0.00681823 -0.0050968 0.07179398 0.10622269 0.13204422], "bayesian blocks" binning strategy used)
Value Count Frequency (%)
-0.005670610555 21 4.8%
-0.04009931749 21 4.8%
-0.02632783472 20 4.5%
0.02187235499 15 3.4%
-0.0332135761 14 3.2%
-0.02288496402 13 2.9%
-0.01255635194 11 2.5%
0.04941532054 11 2.5%
-0.01599922264 11 2.5%
0.00810087222 11 2.5%
Other values (90) 294 66.5%

Value Count Frequency (%)
-0.1123996021 1 0.2%
-0.1089567314 1 0.2%
-0.10207099 1 0.2%
-0.1009233664 1 0.2%
-0.09862811929 1 0.2%

Value Count Frequency (%)
0.1320442172 1 0.2%
0.1251584758 1 0.2%
0.1079441223 3 0.7%
0.1045012516 2 0.5%
0.101058381 1 0.2%

s1
Real number (ℝ)

Distinct count: 141
Unique (%): 31.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: -8.835315586e-17
Minimum: -0.1267806699
Maximum: 0.1539137132
Zeros: 0
Zeros (%): 0.0%
Memory size: 3.6 KiB

Quantile statistics

Minimum: -0.1267806699
5-th percentile: -0.07311850845
Q1: -0.0342478402
Median: -0.004320865537
Q3: 0.02835801485
95-th percentile: 0.08367131975
Maximum: 0.1539137132
Range: 0.2806943831
Interquartile range (IQR): 0.06260585505

Descriptive statistics

Standard deviation: 0.04761904762
Coefficient of variation (CV): -5.389626115e+14
Kurtosis: 0.2329479047
Mean: -8.835315586e-17
Median Absolute Deviation (MAD): 0.03736655352
Skewness: 0.3781082069
Sum: -3.996802889e-14
Variance: 0.002267573696
Histogram with fixed size bins (bins=10)
Histogram with variable size bins (bins=[-0.12678067 -0.07793434 -0.05041529 0.02526212 0.06585273 0.09268381 0.15391371], "bayesian blocks" binning strategy used)
Value Count Frequency (%)
-0.007072771253 10 2.3%
-0.03734373413 10 2.3%
0.02044628591 9 2.0%
0.01219056876 9 2.0%
0.001182945896 8 1.8%
-0.002944912678 8 1.8%
-0.02496015841 8 1.8%
-0.004320865537 8 1.8%
0.02457414449 8 1.8%
-0.005696818395 7 1.6%
Other values (131) 357 80.8%

Value Count Frequency (%)
-0.1267806699 1 0.2%
-0.1088932828 1 0.2%
-0.1047654242 1 0.2%
-0.1033894713 1 0.2%
-0.1006375656 1 0.2%

Value Count Frequency (%)
0.1539137132 1 0.2%
0.1525377603 1 0.2%
0.1332744203 1 0.2%
0.1277706089 2 0.5%
0.126394656 1 0.2%

s2
Real number (ℝ)

Distinct count: 302
Unique (%): 68.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.327024212e-16
Minimum: -0.115613066
Maximum: 0.1987879897
Zeros: 0
Zeros (%): 0.0%
Memory size: 3.6 KiB

Quantile statistics

Minimum: -0.115613066
5-th percentile: -0.07271172671
Q1: -0.03035839726
Median: -0.003819065121
Q3: 0.02984439452
95-th percentile: 0.07946276829
Maximum: 0.1987879897
Range: 0.3144010556
Interquartile range (IQR): 0.06020279178

Descriptive statistics

Standard deviation: 0.04761904762
Coefficient of variation (CV): 3.588408349e+14
Kurtosis: 0.6013811504
Mean: 1.327024212e-16
Median Absolute Deviation (MAD): 0.03748822467
Skewness: 0.4365918037
Sum: 5.750955268e-14
Variance: 0.002267573696
Histogram with fixed size bins (bins=10)
Histogram with variable size bins (bins=[-0.11561307 -0.08179303 -0.03748252 0.02185911 0.05771461 0.09310038 0.19878799], "bayesian blocks" binning strategy used)
Value Count Frequency (%)
0.01622243643 5 1.1%
-0.001000728964 5 1.1%
-0.02480001206 4 0.9%
-0.04703355285 4 0.9%
-0.0138398159 4 0.9%
0.056618588 4 0.9%
-0.003819065121 3 0.7%
-0.02323426975 3 0.7%
-0.01571870667 3 0.7%
0.006201685657 3 0.7%
Other values (292) 404 91.4%

Value Count Frequency (%)
-0.115613066 1 0.2%
-0.1127947298 1 0.2%
-0.106844909 1 0.2%
-0.1043397214 1 0.2%
-0.1008950883 1 0.2%

Value Count Frequency (%)
0.1987879897 1 0.2%
0.1558866504 1 0.2%
0.1314610704 1 0.2%
0.1302084765 1 0.2%
0.1280164373 1 0.2%

s3
Real number (ℝ)

Distinct count: 63
Unique (%): 14.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: -4.574646343e-16
Minimum: -0.1023070505
Maximum: 0.1811790604
Zeros: 0
Zeros (%): 0.0%
Memory size: 3.6 KiB

Quantile statistics

Minimum: -0.1023070505
5-th percentile: -0.06549067248
Q1: -0.03511716059
Median: -0.006584467611
Q3: 0.02931150098
95-th percentile: 0.07790911999
Maximum: 0.1811790604
Range: 0.2834861109
Interquartile range (IQR): 0.06442866157

Descriptive statistics

Standard deviation: 0.04761904762
Coefficient of variation (CV): -1.040933966e+14
Kurtosis: 0.9815074614
Mean: -4.574646343e-16
Median Absolute Deviation (MAD): 0.03751765764
Skewness: 0.7992551183
Sum: -2.017275236e-13
Variance: 0.002267573696
Histogram with fixed size bins (bins=10)
Histogram with variable size bins (bins=[-0.10230705 -0.0783764 -0.0489233 0.02102782 0.07993402 0.18117906], "bayesian blocks" binning strategy used)
Value Count Frequency (%)
-0.01394774322 22 5.0%
-0.04340084565 19 4.3%
-0.03971920785 18 4.1%
-0.002902829807 15 3.4%
-0.03235593224 15 3.4%
0.008142083605 15 3.4%
-0.02131101883 15 3.4%
-0.02867429444 15 3.4%
-0.006584467611 14 3.2%
0.01550535921 14 3.2%
Other values (53) 280 63.3%

Value Count Frequency (%)
-0.1023070505 1 0.2%
-0.09862541271 1 0.2%
-0.09126213711 1 0.2%
-0.08021722369 2 0.5%
-0.07653558589 5 1.1%

Value Count Frequency (%)
0.1811790604 1 0.2%
0.1774974226 1 0.2%
0.1738157848 1 0.2%
0.1590892336 1 0.2%
0.151725958 1 0.2%

s4
Real number (ℝ)

Distinct count: 66
Unique (%): 14.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 3.777301498e-16
Minimum: -0.07639450375
Maximum: 0.1852344433
Zeros: 0
Zeros (%): 0.0%
Memory size: 3.6 KiB

Quantile statistics

Minimum: -0.07639450375
5-th percentile: -0.07639450375
Q1: -0.03949338287
Median: -0.002592261998
Q3: 0.03430885888
95-th percentile: 0.08076737006
Maximum: 0.1852344433
Range: 0.261628947
Interquartile range (IQR): 0.07380224175

Descriptive statistics

Standard deviation: 0.04761904762
Coefficient of variation (CV): 1.260663139e+14
Kurtosis: 0.4444016718
Mean: 3.777301498e-16
Median Absolute Deviation (MAD): 0.03710343824
Skewness: 0.7353736479
Sum: 1.707523012e-13
Variance: 0.002267573696
Histogram with fixed size bins (bins=10)
Histogram with variable size bins (bins=[-0.0763945 -0.07362692 -0.043368 -0.03857085 -0.00480633 ... 0.03523139 0.06438327 0.07157899 0.11918144 0.18523444], "bayesian blocks" binning strategy used)
Value Count Frequency (%)
-0.03949338287 128 29.0%
-0.002592261998 108 24.4%
0.03430885888 68 15.4%
0.07120997975 33 7.5%
-0.07639450375 28 6.3%
0.1081111006 13 2.9%
0.1450122215 2 0.5%
-0.02141183364 2 0.5%
-0.03764832683 2 0.5%
0.01585829844 2 0.5%
Other values (56) 56 12.7%

Value Count Frequency (%)
-0.07639450375 28 6.3%
-0.07085933562 1 0.2%
-0.06938329078 1 0.2%
-0.05351580881 1 0.2%
-0.05167075276 1 0.2%

Value Count Frequency (%)
0.1852344433 1 0.2%
0.1553445354 1 0.2%
0.1450122215 2 0.5%
0.1413221094 1 0.2%
0.1302517732 1 0.2%

s5
Real number (ℝ)

Distinct count: 184
Unique (%): 41.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: -3.830854217e-16
Minimum: -0.1260973856
Maximum: 0.13359898
Zeros: 0
Zeros (%): 0.0%
Memory size: 3.6 KiB

Quantile statistics

Minimum: -0.1260973856
5-th percentile: -0.0721284546
Q1: -0.03324878725
Median: -0.001947634157
Q3: 0.03243322578
95-th percentile: 0.07904666678
Maximum: 0.13359898
Range: 0.2596963656
Interquartile range (IQR): 0.06568201303

Descriptive statistics

Standard deviation: 0.04761904762
Coefficient of variation (CV): -1.243039931e+14
Kurtosis: -0.1343658334
Mean: -3.830854217e-16
Median Absolute Deviation (MAD): 0.03873332127
Skewness: 0.2917738324
Sum: -1.700861674e-13
Variance: 0.002267573696
Histogram with fixed size bins (bins=10)
Histogram with variable size bins (bins=[-0.12609739 -0.07611489 -0.04327875 0.04639655 0.0850142 0.13349736 0.13359898], "bayesian blocks" binning strategy used)
Value Count Frequency (%)
-0.01811826731 11 2.5%
-0.03075120986 10 2.3%
-0.04118038519 8 1.8%
-0.02595242444 7 1.6%
-0.05140053526 7 1.6%
-0.03324878725 7 1.6%
-0.02364455757 6 1.4%
-0.01090443585 6 1.4%
-0.06117659509 6 1.4%
0.01556684454 6 1.4%
Other values (174) 368 83.3%

Value Count Frequency (%)
-0.1260973856 1 0.2%
-0.1043648208 1 0.2%
-0.1016435479 1 0.2%
-0.09643322289 4 0.9%
-0.09393564551 1 0.2%

Value Count Frequency (%)
0.13359898 2 0.5%
0.1333957338 1 0.2%
0.1323726493 1 0.2%
0.1300806095 1 0.2%
0.1290194116 1 0.2%

s6
Real number (ℝ)

Distinct count: 56
Unique (%): 12.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: -3.412882015e-16
Minimum: -0.1377672257
Maximum: 0.1356118307
Zeros: 0
Zeros (%): 0.0%
Memory size: 3.6 KiB

Quantile statistics

Minimum: -0.1377672257
5-th percentile: -0.07563562197
Q1: -0.03317902609
Median: -0.0010776975
Q3: 0.0279170509
95-th percentile: 0.0817644408
Maximum: 0.1356118307
Range: 0.2733790564
Interquartile range (IQR): 0.06109607699

Descriptive statistics

Standard deviation: 0.04761904762
Coefficient of variation (CV): -1.395273772e+14
Kurtosis: 0.2369167379
Mean: -3.412882015e-16
Median Absolute Deviation (MAD): 0.03704056698
Skewness: 0.2079166162
Sum: -1.502131752e-13
Variance: 0.002267573696
Histogram with fixed size bins (bins=10)
Histogram with variable size bins (bins=[-0.13776723 -0.0942751 -0.06113825 0.04241443 0.0879776 0.13561183], "bayesian blocks" binning strategy used)
Value Count Frequency (%)
0.003064409414 22 5.0%
0.01963283707 20 4.5%
0.007206516329 20 4.5%
-0.0010776975 19 4.3%
-0.01764612516 16 3.6%
-0.01350401824 16 3.6%
-0.03835665973 15 3.4%
-0.00936191133 14 3.2%
-0.005219804415 14 3.2%
0.01549073016 14 3.2%
Other values (46) 272 61.5%

Value Count Frequency (%)
-0.1377672257 1 0.2%
-0.1294830119 2 0.5%
-0.1046303704 2 0.5%
-0.09634615654 2 0.5%
-0.09220404963 4 0.9%

Value Count Frequency (%)
0.1356118307 3 0.7%
0.1314697238 2 0.5%
0.1273276169 1 0.2%
0.119043403 2 0.5%
0.1066170823 4 0.9%

Interactions

Correlations

Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, which implies that, for a linear function, the angle to the x-axis does not affect r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
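
The calculation described above can be sketched in a few lines of plain Python. This is an illustrative implementation, not the code the report itself runs; the function name `pearson_r` and the sample data are assumptions made for the example.

```python
from math import sqrt


def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / (n - 1)
    std_x = sqrt(sum((a - mean_x) ** 2 for a in x) / (n - 1))
    std_y = sqrt(sum((b - mean_y) ** 2 for b in y) / (n - 1))
    return cov / (std_x * std_y)


# Hypothetical sample data.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 1.0, 4.0, 3.0, 5.0]

r = pearson_r(x, y)

# Shifting and rescaling x (location and scale change) leaves r unchanged.
r_rescaled = pearson_r([3.0 * a + 7.0 for a in x], y)

print(r, r_rescaled)
```

The second call illustrates the invariance under changes in location and scale mentioned above, here with a positive scale factor.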

Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
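
As a rough sketch of this procedure, the values can be replaced by their ranks and Pearson's formula applied to the ranks. The helper names (`average_ranks`, `pearson_r`, `spearman_rho`) and the sample data are assumptions for illustration; ties are given their average rank, which is a common convention but not something this report states.

```python
from math import sqrt


def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson_r(x, y):
    """Covariance divided by the product of the standard deviations."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / (n - 1)
    std_x = sqrt(sum((a - mean_x) ** 2 for a in x) / (n - 1))
    std_y = sqrt(sum((b - mean_y) ** 2 for b in y) / (n - 1))
    return cov / (std_x * std_y)


def spearman_rho(x, y):
    """Pearson's r computed on the rank variables of x and y."""
    return pearson_r(average_ranks(x), average_ranks(y))


# Hypothetical data: y grows monotonically but not linearly with x.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [v ** 3 for v in x]

print(pearson_r(x, y))     # below 1, because the relationship is not linear
print(spearman_rho(x, y))  # 1 up to rounding, because the relationship is monotonic
```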

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
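
The pair-counting definition above corresponds to the variant usually called tau-a. A minimal sketch follows; the function name `kendall_tau_a` and the sample data are assumptions for illustration, and library implementations may use a tie-adjusted variant that gives different values when the data contain ties.

```python
from itertools import combinations


def kendall_tau_a(x, y):
    """(concordant pairs - discordant pairs) / total number of pairs."""
    concordant = 0
    discordant = 0
    pairs = list(combinations(range(len(x)), 2))
    for i, j in pairs:
        # Positive product: both variables move in the same direction.
        direction = (x[i] - x[j]) * (y[i] - y[j])
        if direction > 0:
            concordant += 1
        elif direction < 0:
            discordant += 1
        # Pairs tied in x or y count as neither concordant nor discordant.
    return (concordant - discordant) / len(pairs)


# Hypothetical sample data: 10 pairs, of which 8 are concordant and 2 discordant.
x = [1, 2, 3, 4, 5]
y = [2, 1, 4, 3, 5]

tau = kendall_tau_a(x, y)
print(tau)
```

This direct version compares every pair and therefore scales quadratically with the number of observations, which is acceptable for a dataset of this size.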

Missing values

Sample

First rows

           age        sex        bmi         bp         s1         s2         s3         s4         s5         s6
0     0.038076   0.050680   0.061696   0.021872  -0.044223  -0.034821  -0.043401  -0.002592   0.019908  -0.017646
1    -0.001882  -0.044642  -0.051474  -0.026328  -0.008449  -0.019163   0.074412  -0.039493  -0.068330  -0.092204
2     0.085299   0.050680   0.044451  -0.005671  -0.045599  -0.034194  -0.032356  -0.002592   0.002864  -0.025930
3    -0.089063  -0.044642  -0.011595  -0.036656   0.012191   0.024991  -0.036038   0.034309   0.022692  -0.009362
4     0.005383  -0.044642  -0.036385   0.021872   0.003935   0.015596   0.008142  -0.002592  -0.031991  -0.046641
5    -0.092695  -0.044642  -0.040696  -0.019442  -0.068991  -0.079288   0.041277  -0.076395  -0.041180  -0.096346
6    -0.045472   0.050680  -0.047163  -0.015999  -0.040096  -0.024800   0.000779  -0.039493  -0.062913  -0.038357
7     0.063504   0.050680  -0.001895   0.066630   0.090620   0.108914   0.022869   0.017703  -0.035817   0.003064
8     0.041708   0.050680   0.061696  -0.040099  -0.013953   0.006202  -0.028674  -0.002592  -0.014956   0.011349
9    -0.070900  -0.044642   0.039062  -0.033214  -0.012577  -0.034508  -0.024993  -0.002592   0.067736  -0.013504

Last rows

           age        sex        bmi         bp         s1         s2         s3         s4         s5         s6
432   0.009016  -0.044642   0.055229  -0.005671   0.057597   0.044719  -0.002903   0.023239   0.055684   0.106617
433  -0.027310  -0.044642  -0.060097  -0.029771   0.046589   0.019980   0.122273  -0.039493  -0.051401  -0.009362
434   0.016281  -0.044642   0.001339   0.008101   0.005311   0.010899   0.030232  -0.039493  -0.045421   0.032059
435  -0.012780  -0.044642  -0.023451  -0.040099  -0.016704   0.004636  -0.017629  -0.002592  -0.038459  -0.038357
436  -0.056370  -0.044642  -0.074108  -0.050428  -0.024960  -0.047034   0.092820  -0.076395  -0.061177  -0.046641
437   0.041708   0.050680   0.019662   0.059744  -0.005697  -0.002566  -0.028674  -0.002592   0.031193   0.007207
438  -0.005515   0.050680  -0.015906  -0.067642   0.049341   0.079165  -0.028674   0.034309  -0.018118   0.044485
439   0.041708   0.050680  -0.015906   0.017282  -0.037344  -0.013840  -0.024993  -0.011080  -0.046879   0.015491
440  -0.045472  -0.044642   0.039062   0.001215   0.016318   0.015283  -0.028674   0.026560   0.044528  -0.025930
441  -0.045472  -0.044642  -0.073030  -0.081414   0.083740   0.027809   0.173816  -0.039493  -0.004220   0.003064